Data analysis system

ABSTRACT

A data analysis system (1) for displaying data facilitating visual analysis of communication transactions includes a transactions database (3) operable to cause representations of the communication transactions to be displayed on a display screen (13) by determining for each transaction a first set of control co-ordinates; determining for each transaction a second set of control co-ordinates for drawing a straight line between co-ordinates associated with the source and destination associated with the communication transaction; calculating as a set of control co-ordinates for representing a transaction weighted averages of corresponding co-ordinates in the first and second set, weighted by a bundling factor; and representing each of the communication transactions as a line drawn utilizing the calculated control co-ordinates for each transaction.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of U.S. patent application Ser. No. 13/102,648, entitled “DATA ANALYSIS SYSTEM” and filed May 6, 2011, the contents of which are incorporated in their entirety by reference.

FIELD OF THE INVENTION

The present application concerns a data analysis system.

More specifically, embodiments of the present application concern methods and apparatus for processing transaction data to identify deviations or anomalies from normal or expected patterns of transactions and to assist in the interpretation of these deviations or anomalies. Such transactions may include communication transactions (e.g. telephone calls, emails, text messages, instant messages, social media etc), financial transactions, accounting transactions, insurance transactions, security trading, and security access. Such deviations may arise due to the occurrence of significant events, system problems or failures, design mistakes, erroneous data entries or fraudulent activity.

BACKGROUND TO THE INVENTION

Current methods for detecting issues and analyzing disruptions within telecommunications networks are often reactive, such that diagnostic and corrective action is not initiated until after problems have been reported. This leads to a poor experience for network users as they must deal with interrupted services and face lengthy issue-resolution times. The identification of deviations or anomalies from normal or expected patterns of communication transactions can be very useful for proactively identifying network issues before they have a significant impact on users, and enabling the diagnostic and corrective action to be initiated promptly. Furthermore, the ability to analyse communication transactions in this way provides a valuable means for accelerating both the diagnosis and resolution of such issues.

In addition, such an analysis of communication transactions can also be used to identify system design issues in order to optimise network configuration and utilisation, identify fraudulent behaviour etc. Moreover, such an analysis can also be used as a means for identifying and even interpreting significant events (e.g. weather, social, political and economic events). For example, it is possible to use the analysis of deviations in the expected patterns of mobile telecommunications such as calls and text messages to identify events such as crisis situations and the early signs of epidemics.

With the rapid advancement and widespread uptake of communication technologies, there are now vast numbers of communication transactions taking place daily. For example, worldwide there are more than 200 billion emails, 4 billion text messages and 90 million tweets sent every day. Consequently, one of the main problems faced when attempting to implement an analysis of communication transaction data is the massive amount of data involved, and the rate at which new data is created. In particular, communication transaction data is usually so substantial, dynamic and varied that it is extremely difficult to carry out a meaningful and conclusive analysis in a short space of time.

In view of the above, an analysis system is desirable which assists with the efficient identification of deviations or anomalies in normal or expected patterns of transactions, entries or events, and the interpretation or diagnosis of the cause of these deviations or anomalies.

SUMMARY OF THE INVENTION

In accordance with one aspect of the present invention there is provided a method of generating a display, displaying data representing a plurality of communication transactions, the method comprising: determining a hierarchy having a tree structure wherein leaf nodes in the lowest level of the hierarchy correspond to sources and destinations associated with communication transactions to be represented; associating elements of the hierarchy with co-ordinates on a display screen; and representing each of the plurality of communication transactions by: determining for each transaction a first set of control co-ordinates comprising the co-ordinates associated with elements in a path in the tree structure connecting the source and destination associated with a communication transaction via the closest common parent in the hierarchy common to the source and destination; determining for each transaction a second set of control co-ordinates for drawing a straight line between co-ordinates associated with the source and destination associated with the communication transaction; calculating as a set of control co-ordinates for representing a transaction weighted averages of corresponding co-ordinates in the first and second set, weighted by a bundling factor; and representing each of the communication transactions as a line drawn utilizing the calculated control co-ordinates for each transaction.

Determining a first set of control co-ordinates may comprise determining a list of nodes on the tree structure for connecting the source and destination associated with a communication transaction via the closest common parent in the hierarchy common to the source and destination and removing the node corresponding to the closest common parent if the source and destination for the transaction are not both child nodes of a single parent node.

Determining a first set of control co-ordinates may also comprise appending as control co-ordinates in the set of control co-ordinates for representing a transaction multiple sets of control co-ordinates associated with the source and destination of the transaction to be represented.

Communication transactions may be represented as lines drawn utilizing the calculated control co-ordinates for each transaction with each communication transaction being represented by an appended series of b-splines as defined by groups of control co-ordinates in the calculated set of co-ordinates.

The lines corresponding to the b-splines may be determined by: determining co-ordinates for a number of points lying on the curve defined by the appended series of b-splines; and calculating co-ordinates for a set of quadrilaterals for representing the transaction on the basis of the co-ordinates of the number of points. The calculation of the co-ordinates for the set of quadrilaterals may be such as to cause the points lying on the curve defined by the appended series of b-splines to lie on the midpoints of opposing ends of the quadrilaterals, with the other sides of the quadrilaterals being parallel to a line connecting the midpoints of the opposing ends. Such quadrilaterals may then be colored.
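By way of illustration only, the calculation of quadrilaterals from points on a curve might be sketched as follows. The function name `quads_along_curve` and the `width` parameter are hypothetical and do not appear in the application; the sketch simply offsets each pair of consecutive curve points perpendicular to the line joining them, so that each point is the midpoint of one end of a quadrilateral and the long sides are parallel to the line joining the two midpoints.

```python
import math


def quads_along_curve(points, width):
    """Return one quadrilateral (list of four (x, y) corners) per pair of
    consecutive curve points. Hypothetical helper, for illustration only."""
    half = width / 2.0
    quads = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # coincident points define no direction; skip them
        # Vector perpendicular to the segment, scaled to half the line width.
        nx, ny = -dy / length * half, dx / length * half
        quads.append([
            (x0 + nx, y0 + ny),  # one end: its midpoint is (x0, y0)
            (x0 - nx, y0 - ny),
            (x1 - nx, y1 - ny),  # opposing end: its midpoint is (x1, y1)
            (x1 + nx, y1 + ny),
        ])
    return quads
```

Each quadrilateral produced in this way could then be filled with a color chosen for the transaction it represents.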

The coloring of such quadrilaterals may be determined based upon a criterion associated with the transaction represented by the quadrilateral such as the timing, frequency or an amount associated with a transaction. Alternatively the coloring of quadrilaterals may vary along the length of the line drawn to represent a transaction.

Drawing lines representing each of the communication transactions may comprise rendering each of the lines in a graphics buffer and then combining the rendered images. Combining the rendered images may comprise: determining maximum color values for areas where lines overlap; determining color values for rendering lines in a constant color and calculating an alpha blend of the rendered lines; and utilizing the calculated maximum color values and the values of the determined alpha blend of constant color lines to determine the colors to be included in a final display.

The communication transactions represented may include: various kinds of communication transactions such as telephone calls, emails, text messages, instant messages, social media messages or posts; and various kinds of other transaction data such as securities trading, insurance, electronic security access data and financial transactions such as credit card transactions, debit card transactions, banking transactions etc.

In accordance with another aspect of the present invention there is provided a data analysis system for displaying data facilitating visual analysis of communication transactions, the system comprising: a transactions database operable to store transaction records defining communication transactions; a display screen operable to display representations of communication transactions as lines connecting positions associated with a source and a destination for a communication transaction; and a processing module operable to determine a hierarchy having a tree structure wherein leaf nodes in the lowest level of the hierarchy correspond to sources and destinations associated with communication transactions represented by transaction records stored in the transactions database; associate elements of the hierarchy with co-ordinates on a display screen; and cause the display screen to display the representations of the communication transactions by: determining for each transaction a first set of control co-ordinates comprising the co-ordinates associated with elements in a path in the tree structure connecting the source and destination associated with a communication transaction via the closest common parent in the hierarchy common to the source and destination; determining for each transaction a second set of control co-ordinates for drawing a straight line between co-ordinates associated with the source and destination associated with the communication transaction; calculating as a set of control co-ordinates for representing a transaction weighted averages of corresponding co-ordinates in the first and second set, weighted by a bundling factor; and representing each of the communication transactions as a line drawn utilizing the calculated control co-ordinates for each transaction.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention will now be described in detail with reference to the accompanying drawings in which:

FIG. 1 is a schematic block diagram of a data analysis system in accordance with an embodiment of the present invention;

FIG. 2 is a schematic illustration of a user interface representing transactions to a user;

FIG. 3 is a flow diagram of the data processing of the data analysis system of FIG. 1;

FIG. 4 is a flow diagram of the processing of the data analysis system of FIG. 1 for generating display data;

FIG. 5 is a tree diagram illustrating an exemplary hierarchy;

FIG. 6 is a schematic diagram illustrating the assignment of co-ordinates to the nodes of the tree diagram of FIG. 5;

FIG. 7 is a flow diagram of the processing of the data analysis system of FIG. 1 to draw a connection between two positions associated with leaf nodes in an exemplary hierarchy;

FIG. 8A and FIG. 8B are schematic illustrations of the manner in which the illustration of connections between a plurality of points can be varied in dependence upon a bundling factor;

FIG. 9 is a schematic illustration of a connection between two positions associated with leaf nodes in an exemplary hierarchy;

FIG. 10 is a flow diagram of the processing undertaken to determine a set of shapes to represent a curve;

FIG. 11 is an illustrative section of a curve to be rendered; and

FIGS. 12-17 are illustrative screen displays of the above described system in use for analyzing transaction data relating to bookkeeping entries and bank transactions.

DESCRIPTION OF THE EMBODIMENTS

FIG. 1 is a schematic block diagram of a data analysis system 1 in accordance with an embodiment of the present invention. The data analysis system 1 comprises a transactions database 3 arranged to store transaction records 5; a processing module 11; and optionally an accounts database 7 arranged to store accounts records 9. The processing module 11 is arranged to retrieve data from the transactions and accounts databases 3, 7 and generate a graphic representation representing the transactions. Data for rendering representations of the transactions as lines on a display 13 is generated and stored in a graphics buffer 14 connected to the processing module 11 and the display 13. In addition, the processing module 11 is also responsive to user input via a keyboard 15 and a mouse 17 to enable a user to interact with the display and select groups of transactions for processing.

In this embodiment, where the data analysis system 1 is arranged to facilitate the analysis of communication transaction data, the transactions database 3 will be configured to store transactions data relating to communication transactions. For fields other than communication transactions, the databases will or can be configured otherwise and hold different kinds of transactional data.

In a system for analyzing communication transactions each transaction record 5 stored in the transactions database 3 will comprise a number of data fields, which in the case of communication transactions will typically include any of the following:

-   Source (e.g. telephone number, email address, user identity etc.)
-   Destination (e.g. telephone number, email address, user identity etc.)
-   Date and Time
-   Transaction type (e.g. call, text message, email etc)
-   Size (e.g. duration, number of characters, data volume etc)
-   Cost
-   Source location information (e.g. physical address, IP address, longitude and latitude, serving mobile base station location etc.)
-   Destination location information (e.g. physical address, IP address, serving mobile base station location etc.)

If required, the accounts database 7 can store details relating to the account or subscription information of individual users of a communication system. For example, the accounts database 7 could store data relating to individual accounts such as:

-   User identity (e.g. name, telephone number, email address, username etc)
-   User location information (e.g. physical address etc)
-   Account number
-   Subscription details (e.g. tariff etc)

In this case, the accounts database 7 can be used in conjunction with the transactions database 3 to determine additional information regarding a communication transaction. By way of example, if the transactions database 3 contained transaction records 5 relating to fixed line telephone calls, but did not specify the source and destination locations, then the accounts database 7 could be used to perform a lookup between the telephone number of the source and/or destination and the location information/physical address of the source/destination. The records 5 in the transactions database 3, and optionally the accounts database 7, enable communications to be monitored. If unusual patterns of communication are detected, further investigation of potential issues or significant events can be initiated. Detection is, however, difficult due to the very large volumes of variable and highly dynamic data that are involved. For this reason, the manner in which the data is processed and the results of that processing are displayed is very important: having the processing module 11 generate display data which makes potentially anomalous transactions more apparent to a user greatly facilitates the detection of issues or significant events.
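The lookup described above might be sketched as follows, purely by way of example. The `TransactionRecord` class, its field names and the `fill_locations` function are hypothetical, and the accounts database 7 is stood in for by a plain dictionary mapping a user identity (such as a telephone number) to a location; an actual system would query the databases 3, 7 instead.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TransactionRecord:
    """Hypothetical, reduced form of a transaction record 5."""
    source: str
    destination: str
    timestamp: str
    transaction_type: str
    source_location: Optional[str] = None
    destination_location: Optional[str] = None


def fill_locations(record: TransactionRecord,
                   accounts: Dict[str, str]) -> TransactionRecord:
    """Fill in missing locations from an accounts lookup table.

    `accounts` is an assumed stand-in for the accounts database 7.
    Locations that cannot be found are left as None.
    """
    if record.source_location is None:
        record.source_location = accounts.get(record.source)
    if record.destination_location is None:
        record.destination_location = accounts.get(record.destination)
    return record
```

A record whose source number appears in the lookup table gains a source location, while an unknown destination number is left without one and could be excluded from, or flagged in, any location-based hierarchy.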

As will be described in detail, in accordance with the present application data representing transactions is processed in a manner which enables large volumes of transaction data to be displayed simultaneously and facilitates user selection of subsets of the transaction data and rapid update of the display. This then enables an investigator to focus on transactions which share certain common attributes for further investigation.

The analytical process, apparatus and method described in this embodiment are designed to display and help the user detect the anomalous patterns resulting from anomalous behaviour. The patterns are the result of a sequence of transactions or transaction flow, and in the examples described herein the transactions take the form of communications transactions. The detection of anomalies that differ from the “normal” pattern of transactions can be expedited by leveraging visual analytics, in which the patterns in the communication transactions are represented visually. The human eye and brain can more quickly adapt to changes in visual representations of data than to representations where data is presented as indexes in rows, columns or tables.

The apparatus and method are designed for the purpose of visually representing the relationships and sequences that exist within the transactions, such that the detection of irregular activity occurs faster and more accurately, as visually it stands out from the crowd of regular transactions. The greater the amount of data which can be visualized, the more effectively the analyst or observer/user can detect patterns, and the more accurately and promptly the analysis can be performed, which is essential for communication transactions, as very large quantities of transactions are occurring at any given time. The larger the amount of data that a tool can visualize, the longer the period of time that can be conceptualized, and thus the more stable and accurate the pattern analysis can be.

This can lead to an operator of a telecommunications network being able to quickly identify points of disruption or congestion within the network, determine the time frame of the disruption or congestion, identify the possible cause of the disruption or congestion, and take proactive steps to remedy the issue or counteract the problem.

To facilitate the identification of anomalous transactions, as will be described, in this embodiment, transactions are illustrated by lines connecting an origin (e.g. a base station of a mobile telecommunications network that is serving a source) and a destination (e.g. the base station serving the destination). The processing module 11 then generates a visual display which causes representations of transactions sharing similar characteristics to be bundled together.

FIG. 2 is a schematic illustration of a user interface representing transactions to a user. In FIG. 2 curved lines connect points at the perimeter of a circle. Each of the lines is shaded from lighter to darker to indicate a direction of connection. Around the perimeter, points are arranged into a hierarchy, such as for example a hierarchy of locations (e.g. country, region, sub-region), which can correspond to locations of base stations, telephone exchanges, email servers etc. These are illustrated by the curved sections at the perimeter of the illustration, with the elements at the top of the hierarchy shown at the outside of the circle labelled layers 0-4, the next level of the hierarchy shown as units 1-50 and the lowest level of the hierarchy shown adjacent the lines connecting points at the perimeter of the circle.

As indicated, transactions sharing common portions of a hierarchy are bundled together to indicate a volume of flow between two locations. As will be described, by selecting data, either by selecting groups of locations or a subset of transactions etc, a user can home in on a group of suspect transactions for more investigation.

Additionally the display can be modified to highlight certain information. Thus for example in certain circumstances it may be desirable to distinguish between the source and destination of transactions. This could enable a subset of transactions to be displayed. The screen display could then be modified to color code the representations of the transactions to highlight some other aspect of the transactions such as the timing of the transactions or the frequency or size of the transactions to make potentially anomalous patterns more apparent.

The processing of the processing module 11 to generate a display such as is shown in FIG. 2 will now be described in detail with reference to FIGS. 3-10.

Referring to FIG. 3, which is a flow diagram of the processing undertaken by the processing module 11, as an initial step (s3-1) the processing module 11, in response to user input via the keyboard 15 and mouse 17, accesses the transactions database 3 to identify transaction records 5 which are to be utilized to generate a display 13. In the case of this initial step the records 5 which are to be displayed can be selected on any basis as determined by the user and indicated to the processing module 11 by user input. Thus for example a user could select a group of records corresponding to a particular time period, location, or transaction type, all of which are recorded either directly or indirectly within the transaction records 5 themselves. Alternatively some indirect measure could be utilized to select the transactions for display such as by processing transaction records 5 to identify sources and/or destinations which for example are associated with an unusually low frequency of transactions or low transaction volumes.
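A minimal sketch of the two kinds of selection mentioned above is given below. It assumes, for illustration only, that each record is a dictionary with hypothetical keys `"source"`, `"type"` and `"time"` (an ISO-style timestamp string, so that string comparison orders records by time); neither the key names nor the function names are taken from the application.

```python
from collections import Counter


def select_records(records, transaction_type=None, start=None, end=None):
    """Direct selection: keep records matching a type and/or time window.

    The window is half-open: start <= time < end.
    """
    selected = []
    for record in records:
        if transaction_type is not None and record["type"] != transaction_type:
            continue
        if start is not None and record["time"] < start:
            continue
        if end is not None and record["time"] >= end:
            continue
        selected.append(record)
    return selected


def low_frequency_sources(records, threshold):
    """Indirect selection: sources with fewer than `threshold` records.

    Only sources that appear at least once can be reported; detecting a
    completely silent source would need a separate list of known sources.
    """
    counts = Counter(record["source"] for record in records)
    return {source for source, n in counts.items() if n < threshold}
```

The set returned by `low_frequency_sources` could then be used to restrict the records passed on for display.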

Once the initial dataset has been determined, the processing module 11 then determines (s3-2) in response to user input the hierarchy against which the selected transaction data is to be displayed.

The hierarchy to be utilized could be determined in response to user input selecting a pre-stored hierarchy from a list. One example of such a hierarchy would be a hierarchy based on location of source and/or destination. In such a case the various locations could be grouped by geographic location into countries, regions and sub-regions etc.

Alternatively an artificial hierarchy could be constructed from the available data fields in the transaction records 5 and, when present, the account records 9. Thus for example a user might decide to categorize transaction size (e.g. duration, data volume etc) into a number of ranges and assign this as the top level of the hierarchy, followed for example by groups of mobile phone number prefixes for a second level in the hierarchy, with high level location data being assigned to a third level in the hierarchy.
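One way such an artificial hierarchy might be assembled is sketched below. Everything specific in it is an assumption made for illustration: the duration thresholds, the use of the first four digits as a prefix group, the dictionary keys (`"duration"`, `"source"`, `"source_country"` and so on) and the choice of the telephone number itself as the leaf node.

```python
def size_range(duration_s):
    """Top level: bucket a call duration in seconds (assumed thresholds)."""
    if duration_s < 60:
        return "short"
    if duration_s < 600:
        return "medium"
    return "long"


def hierarchy_path(record, end):
    """Path from the top level down to the leaf for one end of a record.

    `end` is "source" or "destination". Levels: size range, number
    prefix, high level location, then the number as the leaf.
    """
    number = record[end]
    return (size_range(record["duration"]), number[:4],
            record[end + "_country"], number)


def build_tree(paths):
    """Merge paths into a nested-dictionary tree (leaves map to {})."""
    root = {}
    for path in paths:
        node = root
        for key in path:
            node = node.setdefault(key, {})
    return root
```

Each transaction then contributes two leaf nodes to the tree, one for its source and one for its destination, which is all that the display processing requires of a hierarchy.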

Any user selected hierarchy could be utilized provided the selected hierarchy enables individual transactions to be assigned both a source location and a destination location within the hierarchy. In the case of a simple geographical hierarchy it will be appreciated that both the source and the destination can utilize the same hierarchy (e.g. country, region, sub-region, etc.). In other cases hierarchies could be utilized where source locations and destination locations are determined using different data sets. All that is required is that the selected hierarchy data is sufficient to enable transaction data to be identified with distinct pairs of items of data.

Having entered or selected the hierarchy to be utilized to order the transaction data to be displayed, the processing module 11 then (s3-3) proceeds to generate a display image for the currently selected data set and organizational hierarchy.

By way of example, if a number of base station transceivers (e.g. masts/antennas) in a particular region of a mobile telecommunications network are affected by a local electrical failure, then this will lead to at least a partial decrease in the number of mobile calls and text messages that are both sent to and sent from those base stations. The visual representations displayed to a user in accordance with the methods and apparatus described herein would enable the user to quickly recognise this pattern of unusually low communication transaction volumes for the affected base station transceivers, and to determine that these base station transceivers are located in the same region. It would therefore be apparent that there is a localised problem that is causing this low volume of communication transactions. Moreover, if the information identifying the base station transceivers is supplemented with information identifying the hub via which they are connected to the electricity supply (e.g. the electrical substation), this factor could be introduced into the hierarchy. Doing so would provide a visual indication that all of the affected transceivers share a common connection to the electricity supply, thereby providing an indication this may be the likely point of failure. For example, if this type of detailed information is not included in the transactions database 3 itself, then the system could be configured with a separate database of supplemental information that could be interrogated as and when the analysis requires supplemental information (e.g. when selected by the user).

The processing undertaken by the processing module 11 to generate display data will now be explained with reference to FIGS. 4-10.

Turning first to FIG. 4, which is a flow diagram of the processing undertaken by the processing module 11, initially (s4-1) the processing module 11 assigns each of the elements in the hierarchy being used to a screen location.

FIG. 5 is a tree diagram illustrating an exemplary hierarchy. In such a tree diagram a root node A₀ is provided at the top of the hierarchy. A first set of categories, shown in the figure as A₁, B₁, C₁, forms the first level of the hierarchy. The next level of the hierarchy is shown as A₂, B₂, C₂, D₂, E₂, F₂, G₂. Finally a third layer of the hierarchy is shown as A₃, B₃, C₃, D₃, F₃, G₃, H₃, I₃, J₃, K₃. Thus for example if the hierarchy were to represent geographical location, A₁, B₁, C₁ might correspond to countries, A₂, B₂, C₂, D₂, E₂, F₂, G₂ might correspond to regions and A₃, B₃, C₃, D₃, F₃, G₃, H₃, I₃, J₃, K₃ might correspond to sub-regions.

In the present embodiment, once a hierarchy has been identified to the processing module, the various items on the hierarchy are assigned a screen location. FIG. 6 is a schematic diagram illustrating the assignment of co-ordinates to the nodes of the tree diagram of FIG. 5. In FIG. 6 the various nodes of the tree of FIG. 5 have been allocated positions corresponding to a series of concentric circles with the root node A₀ placed in the centre and the remaining nodes placed in successive concentric circles about the root node with the nodes at lower levels of the hierarchy shown in sections next to their parent nodes in the higher sections of the hierarchy.
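A concentric-circle assignment of this general kind might be sketched as follows. This is an assumed layout rule rather than the one used to produce FIG. 6: leaf nodes are given equal angular slots, every node is placed at the centre of the sector spanned by its own leaves, and the radius is simply the node's depth multiplied by a hypothetical `ring_spacing`. The `children` mapping and the function name are likewise illustrative.

```python
import math


def radial_layout(children, root, ring_spacing=1.0):
    """Map each node to (x, y), with the root at the centre.

    `children` maps a node to the list of its child nodes; nodes absent
    from the mapping are treated as leaves.
    """
    def leaves(node):
        kids = children.get(node, [])
        if not kids:
            return [node]
        return [leaf for kid in kids for leaf in leaves(kid)]

    leaf_order = leaves(root)
    slot = {leaf: i for i, leaf in enumerate(leaf_order)}
    step = 2.0 * math.pi / len(leaf_order)  # angular width of one leaf slot
    coords = {}

    def place(node, depth):
        span = [slot[leaf] for leaf in leaves(node)]
        # Centre of the angular sector covered by this node's leaves.
        angle = (min(span) + max(span) + 1) / 2.0 * step
        radius = depth * ring_spacing
        coords[node] = (radius * math.cos(angle), radius * math.sin(angle))
        for kid in children.get(node, []):
            place(kid, depth + 1)

    place(root, 0)
    return coords
```

Because each child lies inside its parent's sector, nodes at lower levels of the hierarchy end up next to their parent nodes, one ring further out.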

Having assigned a set of co-ordinates for the elements of the hierarchy, the processing module then proceeds to calculate and draw a series of lines representing each of the transactions which is to be plotted. More specifically, first of all an initial transaction to be plotted is selected (s4-2). The processing module then (s4-3) determines a set of control co-ordinates to represent the transaction. As will be explained, these control co-ordinates are selected on the basis of the co-ordinates associated with nodes on the hierarchy and a bundling factor which causes the resultant line to curve to a lesser or greater extent so that lines associated with corresponding portions of the hierarchy are shown as being bundled together.

FIGS. 7 and 9 are respectively a flow diagram of the processing of the processing module to determine a set of control points and a schematic illustration of a connection between two positions associated with leaf nodes in an exemplary hierarchy.

Referring initially to FIG. 7, having selected a transaction to be illustrated on the display 13, the values for the source and destination associated with a line to be drawn are identified and their relative locations on the selected hierarchy are determined.

Thus for example say a transaction is to be illustrated where the line is to be drawn connecting positions corresponding to the location of a base station serving the source of the communication transaction and the location of a base station serving the destination of the communication transaction. In such a case, nodes on the hierarchy corresponding to the source base station location and the destination base station location would be identified. The processing module 11 then determines (s7-1) the least common ancestor (LCA) within the hierarchy being used for the two identified nodes, that is, the lowest node in the hierarchy which connects the node corresponding to the source base station location and the node corresponding to the destination base station location.

Taking the hierarchy of FIG. 5 as an example, in the case of nodes A₃ and D₃, data corresponding to the hierarchy would be processed to determine that stepping back through the tree the least common ancestor for A₃ and D₃ would be A₁. It will be appreciated that identification of the least common ancestor could be determined using any conventional process for determining the common parent for nodes in a tree structure.
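One conventional process of this kind is sketched below: the ancestors of one node are collected, and the tree is then climbed from the other node until a shared ancestor is met. The `PARENT` table is a partial, assumed encoding of FIG. 5 containing only the parent relationships that can be inferred from the paths given in this description (with "A3" standing for A₃ and so on); the function names are illustrative.

```python
# Partial child -> parent table for the hierarchy of FIG. 5 (assumed).
PARENT = {
    "A3": "A2", "B3": "A2", "D3": "C2",
    "A2": "A1", "C2": "A1",
    "A1": "A0",
}


def ancestors(node, parent):
    """Return the node followed by each of its ancestors up to the root."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def least_common_ancestor(a, b, parent):
    """Return the lowest node that is an ancestor of both a and b."""
    seen = set(ancestors(a, parent))
    for node in ancestors(b, parent):
        if node in seen:
            return node
    raise ValueError("nodes do not belong to the same tree")
```

With this table, `least_common_ancestor("A3", "D3", PARENT)` yields "A1", and the same call for "A3" and "B3" yields "A2".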

Thus for example if the hierarchy of FIG. 5 were to correspond to geographical locations where locations A₁, B₁, C₁ correspond to countries, A₂, B₂, C₂, D₂, E₂, F₂, G₂ correspond to regions and A₃, B₃, C₃, D₃, F₃, G₃, H₃, I₃, J₃, K₃ correspond to individual towns, in the example above it would be determined that a source with an address associated with the town identified by A₃ communicated with a destination in another town D₃ in a different region but in the same country as A₃.

Having identified the least common ancestor for the source and destination points to be illustrated, the processing module 11 then proceeds (s7-2) to generate a list for the path connecting the source node to the destination node which passes via the least common ancestor. This can also be determined using conventional techniques.

Thus in the case of nodes A₃ and D₃ for the example hierarchy of FIG. 5 the following path data would be determined: A₃, A₂, A₁, C₂, D₃.

The processing module 11 then (s7-3) proceeds to determine whether the generated list contains more than three elements. If this is not the case, this will mean that the nodes associated with the transaction to be illustrated share the same parent node. In this case the list of nodes is left unamended. If, however, the path data contains more than three elements, in this embodiment the reference to the least common ancestor is removed (s7-4) from the list.

Thus in the case of the path data A₃, A₂, A₁, C₂, D₃ and the hierarchy of FIG. 5, the path data would be modified to become A₃, A₂, C₂, D₃. In contrast where path data for say data associated with nodes A₃ and B₃ were to be determined, here the least common ancestor would be identified as being node A₂ and path data for connecting nodes A₃ and B₃ would be generated as A₃, A₂, B₃. In such a case as the path data includes only three elements no modification of the path data would be made and the generated path data would remain as A₃, A₂, B₃.

After having removed the reference to the least common ancestor from path data containing more than three elements, the path data is then (s7-5) modified by appending three copies of the first and last elements in the list to the beginning and end of the list respectively.

Thus for example in the case of path data comprising the following list A₃, A₂, C₂, D₃, the list would be modified to become A₃, A₃, A₃, A₃, A₂, C₂, D₃, D₃, D₃, D₃.
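Steps s7-1 to s7-5 might be combined as in the following sketch. The function name `control_node_list`, the `copies` parameter and the child-to-parent mapping passed as `parent` are illustrative assumptions; the sketch assumes that the source and destination are distinct nodes of the same tree.

```python
def control_node_list(source, destination, parent, copies=3):
    """Return the padded list of nodes whose co-ordinates act as controls.

    `parent` maps each node to its parent; the root has no entry.
    """
    def chain(node):
        # The node followed by its ancestors up to the root.
        out = [node]
        while node in parent:
            node = parent[node]
            out.append(node)
        return out

    up, down = chain(source), chain(destination)
    down_set = set(down)
    # s7-1: the least common ancestor is the first shared node going up.
    lca = next(node for node in up if node in down_set)
    # s7-2: path from source up to the LCA, then down to the destination.
    path = up[:up.index(lca) + 1] + down[:down.index(lca)][::-1]
    # s7-3 / s7-4: drop the LCA unless both ends share the same parent.
    if len(path) > 3:
        path.remove(lca)
    # s7-5: add extra copies of the first and last elements at the ends.
    return [path[0]] * copies + path + [path[-1]] * copies
```

For a parent table in which A₃ and B₃ are children of A₂, D₃ is a child of C₂, and A₂ and C₂ are children of A₁, the call for A₃ and D₃ reproduces the ten-element list given above, while the call for A₃ and B₃ retains A₂ in the list.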

At this stage, the co-ordinates associated with the elements in the list of path data will define a first set of control points for drawing a line comprising appended b-splines to connect the positions at the beginning and end of the list. Such a curve will bend towards the various control points associated with nodes at different levels of the hierarchy and hence bend towards the co-ordinates of the various positions associated with elements of the hierarchy being utilized such as is illustrated in FIG. 6.
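The effect of the repeated end elements can be seen in the following sketch, which samples points along a series of appended b-splines. It assumes uniform cubic b-splines, each defined by a group of four consecutive control points; the description does not state the degree of the splines, and the function names and the `samples_per_segment` parameter are illustrative.

```python
def _segment_point(group, t):
    """Point at parameter t (0..1) on one uniform cubic b-spline segment."""
    b0 = (1 - t) ** 3 / 6
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    b3 = t**3 / 6
    p0, p1, p2, p3 = group
    return tuple(b0 * a + b1 * b + b2 * c + b3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))


def bspline_points(control, samples_per_segment=8):
    """Sample points along the appended segments defined by `control`.

    Requires at least four control points. Each window of four
    consecutive control points contributes one segment.
    """
    points = []
    for i in range(len(control) - 3):
        group = control[i:i + 4]
        for s in range(samples_per_segment):
            points.append(_segment_point(group, s / samples_per_segment))
    # Close the curve with the end of the final segment.
    points.append(_segment_point(control[-4:], 1.0))
    return points
```

Because the first four and last four control points coincide, the first and last segments begin and end exactly at the source and destination positions, so the sampled curve connects the two leaf nodes while only bending towards the intermediate control points.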

Merely drawing a line based on the co-ordinate positions associated with the elements in the path data would only enable a connection between two points to be drawn in a single way and would not enable a user to vary the extent to which lines deviate towards the various control points. The ability to vary the extent to which lines representing transactions deviate is advantageous as stronger deviation enables connections associated with shared higher elements in the hierarchy to be grouped together, which facilitates the selection of transactions for greater scrutiny.

This is illustrated by FIG. 8A and FIG. 8B which are schematic illustrations of the manner in which the illustration of connections between a plurality of points can be varied in dependence upon a bundling factor. In FIG. 8A transactions associated with shared elements in the upper portions of the hierarchy are shown by lines which deviate less than in FIG. 8B. As can be seen by inspecting FIGS. 8A and 8B, this means that the extent to which transactions share such upper elements in the hierarchy is less clear in FIG. 8A compared with FIG. 8B. This is achieved by bundling the representations more closely together in the central portions of the figures.

Bundling of lines is also used to free up screen space so structure and patterns can become visible and more easily detected and recognized by the user. The bundling operation is essential as the display mentioned in this embodiment can hold very large numbers of lines each representing a set of transactions. Display real estate (pixels) does not allow such a quantity of lines to be presented on the display screen in a single instance; rather the lines are displayed in a (virtual) overlay mode. Long before the physical limits of the computer display are reached a user will no longer be able to easily identify individual line(s) of interest; instead the human eye and brain only perceive clutter as shown in FIG. 8A. The bundling as shown in FIG. 8B provides structure to the image for better human perception and comprehension and helps the user identify the flow of transactions. In case the user wishes to see details of sub-sets or even individual transactions the user can move the mouse to the perimeter of the line view and by hovering over the lines is able to highlight these while the other lines are faded into the background. To render individual lines visible the system allows for dynamic interrogation of parts of the display that renders the data set, as a zoom-in function.

In order to provide the ability to vary the extent to which lines representing transactions are bundled together, having determined path data identifying a first set of control points corresponding to the co-ordinates associated with the list of path data, the processing module 11 then (s7-6) proceeds to determine an alternative set of control points for drawing a straight line between the end points of the line defined by the first set of control points.

FIG. 9 is a schematic illustration of a connection between two positions labeled A and G connected by a curve shown by a thick line in FIG. 9. The curve comprises a set of appended b-splines defined by control points A, B, C, D, E, F, G.

An alternative connection between the two positions A and G would be a straight line between those two points. Such a line can also be represented by a set of appended b-splines where the control points for such b-splines have co-ordinates which lie along the line connecting A and G.

An alternative set of control points for a straight line connection can be derived by taking the co-ordinates associated with positions A and G and then selecting a number of points on the line between them. In this embodiment the alternative control points B′, C′, D′ and E′ are determined by calculating the positions on the line A-G nearest to the positions of the control points B, C, D and E respectively.

Control points for a line connecting A and G which curves to a lesser or greater extent can then be determined (s7-7) by using a weighted average of the co-ordinates associated with the original and the alternative control points. That is to say, control points for drawing a connection between A and G can be selected to be at positions along any of the dotted lines shown in FIG. 9 connecting B and B′, C and C′, D and D′ and E and E′. The control points for such lines can be calculated using the following equation:

Control point co-ordinates = B × (original co-ordinates) + (1 − B) × (alternative co-ordinates), where B is the selected bundling factor to be utilized.

If a bundling factor of 1 is selected the curved lines will utilize the original co-ordinate positions as control points, whereas if a bundling factor of 0 is selected connections will be represented by straight lines connecting the start and end points.
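A minimal sketch of steps s7-6 and s7-7 is given below, following the convention just stated that a bundling factor of 1 retains the original control points and a bundling factor of 0 yields a straight line. The function names are illustrative assumptions, and the nearest position on the line A-G is taken here to be the orthogonal projection onto the line through the two end points.

```python
def project_onto_line(p, a, g):
    """Nearest position to p on the line through end points a and g."""
    dx, dy = g[0] - a[0], g[1] - a[1]
    denom = dx * dx + dy * dy
    if denom == 0:            # degenerate case: both end points coincide
        return (float(a[0]), float(a[1]))
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / denom
    return (a[0] + t * dx, a[1] + t * dy)


def bundle(control_points, b):
    """Blend original control points with their straight-line alternatives.

    b = 1 keeps the original (fully bundled) control points,
    b = 0 moves every control point onto the straight line.
    """
    a, g = control_points[0], control_points[-1]
    blended = []
    for p in control_points:
        q = project_onto_line(p, a, g)      # alternative control point
        blended.append((b * p[0] + (1 - b) * q[0],
                        b * p[1] + (1 - b) * q[1]))
    return blended


original = [(0, 0), (1, 2), (2, 2), (4, 0)]
print(bundle(original, 0.5))
```

Intermediate bundling factors place each control point part-way along the segment joining the original control point and its straight-line alternative.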

Overlapping groups of four of the set of generated control points can then be used to draw a line between the positions associated with the source and destination elements. More specifically, each set of four control points can be utilized as control points for a cubic b-spline, with the entire line connecting the source and destination elements being the piecewise cubic b-spline formed by the concatenation of these individual b-splines.
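The use of overlapping groups of four control points can be illustrated by the following sketch, which samples points along the concatenated curve. It assumes a uniform cubic b-spline basis, which the description does not state explicitly, and the function names are hypothetical.

```python
def bspline_point(p0, p1, p2, p3, t):
    """Evaluate one uniform cubic b-spline segment at t in [0, 1]."""
    b0 = (1 - t) ** 3 / 6.0
    b1 = (3 * t ** 3 - 6 * t ** 2 + 4) / 6.0
    b2 = (-3 * t ** 3 + 3 * t ** 2 + 3 * t + 1) / 6.0
    b3 = t ** 3 / 6.0
    return tuple(b0 * a + b1 * b + b2 * c + b3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))


def sample_curve(control_points, steps=8):
    """Sample the piecewise curve formed by sliding a window of four points."""
    points = []
    for i in range(len(control_points) - 3):
        window = control_points[i:i + 4]
        for s in range(steps):
            points.append(bspline_point(*window, s / steps))
    # close the final segment at t = 1
    points.append(bspline_point(*control_points[-4:], 1.0))
    return points


# End points repeated four times in total, as in the padded path data.
ctrl = [(0.0, 0.0)] * 4 + [(1.0, 2.0), (3.0, 2.0)] + [(4.0, 0.0)] * 4
curve = sample_curve(ctrl)
print(curve[0], curve[-1])
```

Because the first and last control points each appear four times, the first and last segments collapse onto the end points, so the sampled curve starts and ends exactly at the source and destination positions.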

Returning to FIG. 4, having determined a set of control points for representing a transaction as a b-spline connecting two positions corresponding to the source and destination for the transaction, the processing module 11 then proceeds to determine a set of shapes (s4-4) for drawing each b-spline using the control points.

Mathematically a b-spline is fully defined based solely on the co-ordinates associated with a set of control points. When rendering a b-spline as an image, it is necessary to break the mathematical curve which is to be drawn into a set of shapes which can be rendered by the computer. This is necessary because in order to be rendered the curve must be given a thickness so that the rendering of the curve can be seen.

The processing to determine such a representation will now be described with reference to FIGS. 10 and 11 which are a flow diagram of the processing of the processing module 11 and an illustrative section of a curve to be rendered.

To achieve such a rendering, initially (s10-1) a set of points on the line to be rendered is determined. The co-ordinates for the points can be determined directly by processing the control co-ordinates for the b-spline for the curve to be rendered. These will comprise a set of points all of which should lie at the middle of the curve which is to be rendered.

FIG. 11 is an illustrative section of a curve to be rendered. In the illustration the points 50, 51, 52, 53 lying on the thick black line comprise points mathematically determined to correspond to the portion of the line to be rendered.

Starting with the first point 50, the processing module 11 then (s10-2) determines co-ordinates for the edge of the line at that point. This is achieved by determining a normal (N) to a vector connecting the first 50 and second 51 points on the line, identifying the positions a length of ±w/2 from the first point 50 along that normal, and assigning these co-ordinates to the corners 54, 55 of the beginning of the line to be rendered. In this way two co-ordinates 54, 55 separated by a distance w corresponding to the selected width of the line to be rendered are identified, where the first selected point 50 lies in the middle of the two points 54, 55.

Having determined the initial end points 54, 55, the processing module 11 then proceeds to calculate (s10-3) a unit vector (V₁) which bisects the angle formed by lines connecting the first 50 and second 51 and the second 51 and third 52 points which lie on the mathematical line being rendered.

Co-ordinates for the quadrilateral representing an initial section of the curve are then (s10-4) determined by identifying points 56, 57 a distance ±w/(2 sin θ₁) along V₁ from the second point 51 lying on the mathematical curve being rendered, where θ₁ is the angle between the line connecting the first 50 and second 51 points on the curve being rendered and the vector V₁.

Having calculated co-ordinates for the quadrilateral for rendering the first section, the processing module 11 then checks (s10-5) to see if the complete line has now been rendered.

If this is not the case the processing module 11 then proceeds to calculate (s10-3) a vector bisecting lines connecting the next two pairs of points on the mathematical curve being rendered. Thus having processed the first 50, second 51 and third 52 points on the line the processing module 11 will then determine a vector V₂ bisecting the lines connecting the second 51 and third 52 and the third 52 and fourth 53 points on the curve being rendered.

Once this vector has been determined, the end points for the next quadrilateral for representing the next section of the curve are then (s10-4) determined utilizing the vector V₂ and the angle θ₂ obtained when the angle between the lines connecting the second 51 and third 52 and the third 52 and fourth 53 points on the curve being rendered is bisected by the vector V₂. After this the processing module 11 once again checks (s10-5) whether the end of the curve being rendered has been reached.
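The loop of steps s10-2 to s10-5 might be sketched as follows. The sketch makes several assumptions not taken from the description: the final point is given a normal in the same way as the first point, consecutive centre-line points are distinct (duplicate samples would need to be dropped beforehand), the line never doubles back on itself exactly, and all function names are hypothetical.

```python
import math


def _unit(v):
    length = math.hypot(v[0], v[1])
    return (v[0] / length, v[1] / length)


def _cross(a, b):
    return a[0] * b[1] - a[1] * b[0]


def edge_points(points, width):
    """Return (left, right) edge co-ordinates for each centre-line point."""
    half = width / 2.0
    left, right = [], []
    last = len(points) - 1
    for i, p in enumerate(points):
        if i == 0 or i == last:
            # s10-2: normal to the first (or, by assumption, last) segment
            a, b = (points[0], points[1]) if i == 0 else (points[-2], points[-1])
            d = _unit((b[0] - a[0], b[1] - a[1]))
            v, dist = (-d[1], d[0]), half
        else:
            d0 = _unit((p[0] - points[i - 1][0], p[1] - points[i - 1][1]))
            d1 = _unit((points[i + 1][0] - p[0], points[i + 1][1] - p[1]))
            bis = (d1[0] - d0[0], d1[1] - d0[1])   # s10-3: bisecting direction
            if math.hypot(bis[0], bis[1]) < 1e-12:
                v = (-d0[1], d0[0])                # collinear: plain normal
            else:
                v = _unit(bis)
                if _cross(d0, v) < 0:              # keep v on the left-hand side
                    v = (-v[0], -v[1])
            # s10-4: offset w / (2 sin(theta)), with sin(theta) = |d0 x v|
            dist = half / abs(_cross(d0, v))
        left.append((p[0] + v[0] * dist, p[1] + v[1] * dist))
        right.append((p[0] - v[0] * dist, p[1] - v[1] * dist))
    return left, right


def quads(points, width):
    """One quadrilateral per section of the centre line."""
    left, right = edge_points(points, width)
    return [(left[i], right[i], right[i + 1], left[i + 1])
            for i in range(len(points) - 1)]


print(quads([(0, 0), (1, 0), (1, 1)], 2.0))
```

Dividing by sin θ lengthens the offset at a bend so that the edges of adjoining quadrilaterals meet, which keeps the rendered line at a constant width w.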

Returning to FIG. 4, having determined a set of quadrilaterals to represent the line connecting the positions corresponding to the source and destination for a transaction, the processing module 11 then (s4-5) determines a coloring for the line.

The coloring for the line can be determined in a number of different ways, depending on what information a user wishes to highlight. If so selected by a user, the processing module 11 may be arranged to render lines which vary in color, say from red to green, to enable a user to distinguish the source and destination for a represented transaction. In such a case the coloring of the line segments would be determined so that the color assigned to a portion of a line is dependent upon that portion's position on the line.
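As a simple illustration of position-dependent coloring, the sketch below assigns each section of a line a color interpolated linearly from red at the source to green at the destination. The linear interpolation in RGB and the function names are assumptions; the description does not prescribe how the variation is computed.

```python
def lerp_color(start, end, t):
    """Linear interpolation between two RGB colors, t in [0, 1]."""
    return tuple(s + (e - s) * t for s, e in zip(start, end))


def segment_colors(n_segments, start=(1.0, 0.0, 0.0), end=(0.0, 1.0, 0.0)):
    """One color per line section, running from start (source) to end."""
    if n_segments == 1:
        return [start]
    return [lerp_color(start, end, i / (n_segments - 1))
            for i in range(n_segments)]


print(segment_colors(5))
```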

Alternatively, the processing module 11 could be arranged to color code the transactions based on some other factor. For example different colors could be assigned to transactions depending upon the timing of the transactions, the size of the transactions or the frequency with which particular transactions were made. Such a coloring would then enable a user to identify transactions sharing certain criteria for further analysis.

Having determined the coloring to be used to color the line representing a transaction, the processing module 11 then (s4-6) renders the line to the graphics buffer 14 using standard OpenGL techniques.

When rendering different lines corresponding to different transactions the lines could be rendered in order with earlier rendered lines being overwritten by later rendered lines. However where lines are colored to indicate additional information about transactions it is preferable that when the renderings of different transactions are combined the processing module 11 determines maximum color values for any particular position based on the rendering. Such an approach has the benefit of highlighting lines which differ from most transactions and hence makes outliers more apparent.

Having rendered the currently selected transaction as a line on the display, the processing module 11 then determines (s4-7) whether the final transaction has been reached. If this is not the case the next transaction is selected (s4-8) and processed (s4-3 to s4-5) and rendered (s4-6) to the graphics buffer 14 before the processing module 11 checks (s4-7) once again whether the final transaction has been reached.

When the final transaction has been processed the processing module 11 can then cause the image stored in the graphics buffer 14 to be displayed (s4-9).

In some embodiments it may be preferable to modify the content of the graphics buffer 14 before display. In particular where lines representing transactions are rendered by determining maximum color values for particular channels at points of overlap between lines representing different transactions, the processing module 11 may proceed to modify the color values to mimic some kind of alpha blending for the representation.

As noted above, where lines are rendered which overlap one another one approach to deal with such areas is merely to utilize the final overwrite and display that information on the screen. Another approach would be to take an alpha blend of the multiple representations. This would cause areas of overlap to appear darker and hence would provide a user with information about the numbers of transactions in an area of overlap. However utilizing an alpha blend approach averages out color representations and hence where a transaction differs in appearance from the majority of transactions shown in a particular portion of the screen that difference may not be apparent because the single outlier transaction will have a very limited influence on the average appearance of the area of overlap.

It is for this reason that it can be preferable to render areas of overlap utilizing maximum color values. In this way if, say for example, the majority of transactions in a portion of the screen were to be rendered green and an outlier transaction were to be rendered red, the existence of the outlier transaction would still be apparent on the screen.

A disadvantage with such an approach is that utilizing maximum color values to represent areas of overlap rather than utilizing alpha blending results in images where the number of times an area is overwritten is not apparent. Thus by utilizing just the maximum color values for a position it ceases to be possible to identify the numbers of transactions rendered in a particular area of the screen.

A compromise approach can, however, be achieved which enables outlier transactions to be highlighted and the numbers of transactions rendered in a particular area to be made apparent. This is to render areas of overlap selecting maximum color values and then to combine the results with a re-rendering of the lines in solid color (e.g. black or white) using alpha blending. The re-rendering in solid color then provides data as to how the maximum color values should be altered to account for the numbers of lines overlapping at a particular position.
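One possible reading of this compromise is sketched below for a single screen position. The description does not specify how the two renderings are combined; here, purely as an assumption, the per-channel maximum color is multiplied by the coverage obtained from alpha blending solid white layers. The function name and the default alpha value are also hypothetical.

```python
def composite(layers, alpha=0.25):
    """Combine the colors of all lines drawn at one pixel.

    layers: list of RGB tuples with components in the range 0..1.
    """
    if not layers:
        return (0.0, 0.0, 0.0)
    # First rendering: per-channel maximum, so an outlier color survives.
    max_rgb = tuple(max(c[k] for c in layers) for k in range(3))
    # Second rendering: solid white drawn once per line with alpha blending;
    # the accumulated coverage grows with the number of overlapping lines.
    coverage = 0.0
    for _ in layers:
        coverage = coverage + alpha * (1.0 - coverage)
    return tuple(m * coverage for m in max_rgb)


green, red = (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)
print(composite([green]))
print(composite([green] * 5 + [red]))
```

In this sketch a single red outlier among several green lines still contributes a red component, while the overall intensity rises with the number of overlapping lines.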

Alternatively, the coloring of lines could be determined in other ways. Frequently it will be desirable that the full range of colors or shades should be utilized to represent lines on an image. In such embodiments the extent of over-writing of particular points on the screen could be determined and the coloring scaled so that the most and least over-written portions of the screen are mapped to the lightest and darkest colors with intermediate portions of the screen being mapped to intermediate colors. In such embodiments the mapping of intermediate portions of the screen could be linear. However in some embodiments some kind of non-linear mapping may be preferable. Suitable non-linear color mappings could include for example a logarithmic or exponential mapping. Using non-linear mappings may be preferable as they would enable outlier values to be more clearly represented.
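The difference between a linear and a logarithmic mapping can be illustrated by the following sketch, which maps an over-write count to a fraction between 0 and 1 that could then index a range of shades. The function name and the assumption that every counted position has been written at least once are illustrative only.

```python
import math


def shade(count, lo, hi, mode="linear"):
    """Map an over-write count in [lo, hi] to a fraction in [0, 1].

    Assumes counts are at least 1 so that the logarithm is defined.
    """
    if hi == lo:
        return 0.0
    if mode == "log":
        return ((math.log(count) - math.log(lo))
                / (math.log(hi) - math.log(lo)))
    return (count - lo) / (hi - lo)


for n in (1, 10, 100, 1000):
    print(n, shade(n, 1, 1000), shade(n, 1, 1000, mode="log"))
```

With counts ranging from 1 to 1000, the linear mapping leaves a position over-written ten times almost indistinguishable from one written once, whereas the logarithmic mapping places it about a third of the way along the range.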

When a representation of the selected transactions has been rendered and displayed on the screen a user can then input selections via the keyboard 15 and mouse 17 to vary the display.

Typically the user input will instruct the processing module 11 to re-render the display based on an alternative selection of transactions, attributes belonging either to transactions or elements, hierarchy and/or rendering.

FIGS. 12-16 are illustrative screen displays of the above described system in use for analyzing transaction data relating to mobile telecommunications transactions.

As shown in FIG. 12, in this example the screen display is shown as being in two parts: a first section 100 in which transactions are shown as lines connecting sources and destinations arranged in a circle, and a second section 102 where transactions are shown as straight lines arranged in order of time or sequence associated with the transactions.

In this example the hierarchy used comprises geographical locations associated with base station transceivers. In the figures, in the first section 100 of the screen the individual base station locations are shown. In the second section 102 of the screen the hierarchy used is shown above the lines representing transactions. In this section of the screen the individual lines extend between points corresponding to the sources and destinations of transactions.

In both the first and the second sections 100, 102 lines corresponding to transactions are shaded to indicate one or more selected attributes, for example direction from source to destination, or age of the flow of the transaction. In embodiments this may be best shown using color. FIG. 12 shows an illustration of 8777 lines corresponding to transactions in both of the sections 100, 102 of the screen.

The illustration of transaction data in the manner shown in FIG. 12 facilitates user investigation and manipulation of the transaction data. In particular the first section 100 of the screen, using the selected hierarchy, illustrates communications flowing between source and destination base stations, whereas the second section 102 provides a user with information relating to the timing of individual transactions. The ordering of the lines in the second section 102 could be based either on an identifier or some other data associated with each individual base station transceiver or on the basis of time data associated with transactions. The display may include a representation of a scale to identify the range of transactions being displayed. Thus for example an indication of a range of base station transceivers or a range of dates or times might be displayed.

Using the keyboard 15 or the mouse 17, a user can then drill down within the available information to investigate further.

Thus for example a user might input criteria for selecting transactions limited to a certain time period, size or frequency to be displayed and only transactions meeting such criteria would then be extracted from the transactions database 3 for display on the screen. The screen display 13 could be arranged to display user selectable menus to facilitate such selection or entry of data indicating the selection criteria to be used.

Alternatively a subset of the displayed transactions could be selected by identifying a section of the screen using a pointer under the control of the mouse 17. In such a case it would be necessary to identify which transactions resulted in the rendering of display data to a particular section of screen. This could be achieved by checking the display data rendered to the graphics buffer 14 which was utilized to render a particular display.

A problem with such an approach arises where transactions are selected by drawing a line on the screen to identify transactions to be investigated. This is because it is possible that the line will not intersect with pixels which are rendered. This is particularly a problem where an oblique line is drawn as it is possible that a rendering of an oblique line will pass through another line at a different angle without there being any pixels in common. Such a problem can be avoided by utilizing the identification of a line with a pointer to define a box a number of pixels thick and identifying renderings which occur anywhere in that box. However this can result in too many transactions being identified as being of interest.

An alternative approach would, however, be to utilize the drawing of a line to identify an angle and, without updating the display, re-render a representation of the current display in the graphics buffer 14 with a rotation applied based on the angle of the line. This would then ensure that the selected line corresponds to a row or column of pixels in the re-rendered image. In such a case any renderings of transactions which correspond to the selected portion of the row or column could be reliably identified.
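The rotation used in this approach can be sketched as follows: the angle of the drawn line is measured and the opposite rotation is applied so that the line becomes horizontal, and therefore coincides with a row of pixels once rasterized. Only the geometry is shown; the re-rendering into the graphics buffer 14 is not, and the function names are hypothetical.

```python
import math


def rotation_for_line(p, q):
    """Angle (radians) that makes the line from p to q horizontal."""
    return -math.atan2(q[1] - p[1], q[0] - p[0])


def rotate(point, angle, origin=(0.0, 0.0)):
    """Rotate a point about origin by angle (counter-clockwise, radians)."""
    x, y = point[0] - origin[0], point[1] - origin[1]
    c, s = math.cos(angle), math.sin(angle)
    return (origin[0] + x * c - y * s, origin[1] + x * s + y * c)


p, q = (1.0, 1.0), (4.0, 5.0)          # end points of the user-drawn line
angle = rotation_for_line(p, q)
print(rotate(p, angle), rotate(q, angle))
```

Applying the same rotation to every rendered line places the user-drawn selection along a single row, so the renderings that cross it can be found by inspecting that row alone.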

In the case of the transactions shown in FIG. 12, it is apparent from the Figure that the transactions shown in the second section 102 of the screen naturally form four blocks 104-110.

FIG. 13 is an illustration of how the screen would be updated if the first 104 of the blocks 104-110 in FIG. 12 were to be selected for display. This would reduce the numbers of transactions to be illustrated from 8777 to 1499. Reviewing the content of the display shown in FIG. 13 would then reveal in this example that the block consisted mainly of corresponding communication transactions between a particular pair of base stations with the greatest numbers of transactions shown by the thickest bundles of lines 112.

FIG. 14 is an illustration of how the screen would be updated if the second 106 of the blocks 104-110 in FIG. 12 were to be selected for display. This would cause a different set of transactions to be highlighted.

As is shown in FIG. 14, when restricting the rendering of transactions just to transactions in this block, a further subdivision 116 of the transactions becomes apparent in the second section 102 of the display.

In a similar way this subdivision can be selected and the results displayed. The results of making such a selection are illustrated in FIG. 15.

Illustrating communication transaction data makes unusual patterns of data more apparent to a user. A number of such unusual patterns are illustrated in FIG. 16.

FIG. 16 is an illustration of the results of restricting the transactions of FIG. 12 to the third block of transactions 108. However, in contrast to FIG. 14 where all the transactions have a common origin, in FIG. 16 one of the transactions 118 is shown as being connected to a different origin base station and a number of the transactions 120 are shown as having both a different source and a different destination. Additionally in FIG. 16, a group of transactions 122 is apparent in the second section 102 of the screen which might require additional investigation.

Any such transactions may potentially be due to a fault and hence warrant further investigation. To facilitate such investigations, selection of such individual transactions or groups of transactions could cause additional information about such transactions to be displayed.

In addition to selecting groups of transactions for further investigation, other means for selecting subsets of transactions to be viewed could also be provided. A user might be able to select portions of the hierarchy and eliminate them from the transactions being considered. In such an example any transactions relating to the excluded portions of the hierarchy would not be rendered. Additionally, the processing module 11 could be arranged to reassign co-ordinate positions to the remaining portions of the hierarchy to enable the remaining transactions to be better distinguished.

An additional way to facilitate investigation and selection of transactions would be to enable the processing module 11 to be responsive to user input to alter the bundling factor to be utilized to render an image. As shown in FIGS. 8A and 8B varying the bundling factor varies the extent to which transactions sharing higher elements in the hierarchy are grouped together. This would then facilitate on-screen selection of such groups of transactions.

A further way in which the display could be altered would be for the processing module 11 to be responsive to user input to alter the co-ordinates associated with the selected hierarchy. Thus for example instead of associating elements in a hierarchy with positions arranged in a series of concentric circles such as is illustrated in FIG. 6, hierarchy elements could be arranged in a series of lines. Such an arrangement would be particularly suitable when analyzing data where the source and destination data for transactions was to be analyzed against different hierarchies.

Another alternative would be to alter the rules for coloring images to, for example, color lines based on a time ordering rather than in a manner to distinguish between sources and destinations.

In the previous examples transaction data has been illustrated as being rendered as a set of connections between points arranged around the perimeter of a circle. It will be appreciated that the arrangement of transactions is determined by the assignment of co-ordinates to control points for drawing lines and hence by the assignment of co-ordinates to elements in the hierarchy being used.

Instead of assigning a set of co-ordinates to elements in a hierarchy arranged in a series of concentric circles, an alternative approach would be to arrange such co-ordinates to lie in a series of straight lines. FIG. 17 is an illustration of a set of transactions rendered using a set of co-ordinates assigned to hierarchy members where such co-ordinates are arranged in a series of straight lines.

In FIG. 17, a separate set of labels 200, 202 is shown identifying a hierarchy for sources and for destinations. Co-ordinates for the elements in the hierarchies are then assigned positions where the co-ordinates lie on one of a set of parallel lines 204, with elements highest in the hierarchy lying on lines closest to the centre of the set 204 and the leaf nodes being associated with points lying on lines immediately adjacent the labels.

It will be appreciated that illustration of transactions in this way is particularly suitable for illustration of transactions where the hierarchies to be utilized for source and destination elements are different.

As a further alternative, if the source and destination of the communication transactions are defined by physical location information, such as the longitude and latitude of the base station transceivers involved in a mobile telecommunication transaction, or the address associated with a fixed line telephone call, then these coordinates could be applied to a map display. Communication transactions could then be rendered onto such a geospatial view as arcs.

By way of example, arcs representing communication transactions could be rendered as quadratic Bezier curves, with the direction of the communication encoded clockwise. This could be achieved by determining the vector from the source to the destination, computing the orthogonal to this vector and giving the orthogonal vector a length of half of that of the source to destination vector. This orthogonal vector can then be positioned halfway between the source and destination, and the end point of this orthogonal vector used as the control point for a quadratic Bezier curve. Given that the computation of the orthogonal vector in this way takes into account that vector direction is from the source to the destination, the clockwise direction of the communication transaction is automatically inferred. Markers representing the source and the destination can then be rendered onto the geospatial view/map. For example, these markers could take the form of white dots with a radial gradient from white opaque (innermost) to full transparent blue (outermost). Finally, additive blending techniques could be used to render the sources, the destinations, and the arcs on the geospatial view/map in order to create a ‘glow’ effect, wherein the intensity provides an indication of the density of sources, destinations, and transactions.
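The construction of the control point might be sketched as below. Which of the two orthogonal directions is taken is an assumption: here the source-to-destination vector is rotated clockwise in a co-ordinate system whose y-axis points upwards, so the sense would be reversed in screen co-ordinates with y pointing downwards. The function names are hypothetical.

```python
def arc_control_point(src, dst):
    """Control point for a quadratic Bezier arc from src to dst."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    # Orthogonal vector (clockwise rotation, y-up), half the length of (dx, dy).
    ox, oy = dy * 0.5, -dx * 0.5
    # Position the orthogonal vector halfway between source and destination.
    mx, my = (src[0] + dst[0]) / 2.0, (src[1] + dst[1]) / 2.0
    return (mx + ox, my + oy)


def quadratic_bezier(p0, c, p2, t):
    """Point on the quadratic Bezier curve with control point c, t in [0, 1]."""
    u = 1.0 - t
    return (u * u * p0[0] + 2 * u * t * c[0] + t * t * p2[0],
            u * u * p0[1] + 2 * u * t * c[1] + t * t * p2[1])


src, dst = (0.0, 0.0), (2.0, 0.0)
ctrl = arc_control_point(src, dst)
print(ctrl, quadratic_bezier(src, ctrl, dst, 0.5))
print(arc_control_point(dst, src))      # the reverse direction bows the other way
```

Because the orthogonal vector is derived from the source-to-destination vector, transactions in opposite directions between the same two locations produce arcs on opposite sides of the straight line joining them.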

In addition, when displayed on a geospatial view, color can be used to depict whether the number of communication transactions, or some other measure associated with an arc, is lower or higher when compared to a previous point in time (e.g. the previous day). For example, when the measure to be depicted by a color has decreased for a particular arc, this could be represented by a red color, whilst an increase in the measure could be represented by a green color, and any precisely overlapping arcs where one is increasing (green) and the other decreasing (red) could be rendered as yellow. Moreover, the opacity of the arcs can also be used to indicate some value associated with each arc relative to that of the other arcs (e.g. the number of transactions, size or cost of transactions etc). Both the coloring and opacity of the arcs can therefore emphasize the communication transactions that are likely to be of most significance. Representing communication transactions in this way enables users to quickly recognise patterns of communication transactions to and/or from specific locations, and allows users to differentiate between increasing and decreasing levels of communication traffic.

Such a geospatial view could also be supplemented with additional view formats for further exploration of the communication transactions displayed on the geospatial view.

For example, the geospatial view could be supplemented with a line graph that is rendered to illustrate the change in an aggregate measure associated with the transactions over time. To do so, time could be plotted on the y-axis of the line graph with one or more aggregate measures (e.g. number of communication transactions, size of communication transaction, etc) plotted on the x-axis. Such a line graph would provide a further means by which a user could identify significant points in time.

As a further example, the geospatial view could also be supplemented with a bar graph representing the contribution of each communication channel (i.e. source-destination pair) to an aggregate measure at a selected point in time, wherein the relative size of each bar would indicate the relative contribution. Color coding of each bar could then be used to depict whether the measure for that communication channel is lower or higher when compared to a previous point in time (e.g. the previous day).

As a yet further example, the geospatial view could also be supplemented with a matrix view in which individual sources and/or destinations of communications transactions are plotted on the vertical or y-axis, with time being represented on the horizontal or x-axis. For each source and/or destination, the variance in the measure is then rendered onto the matrix view as a heat map, in which the color of a particular area in the matrix represents the value of the measure at the corresponding point in time for the corresponding source/destination. Such a matrix view provides a further means for a user to identify sources and/or destinations that display similar behaviour over time. Furthermore, by applying clustering methods to the rows shown in the matrix, the rows of the matrix can be re-ordered such that sources and/or destinations that display similar behaviour are grouped together.

The ability to analyse and understand patterns of communication transactions can provide a means for identifying the occurrence of significant events (e.g. weather, social, political and economic events), as such events will typically have an impact on communication traffic in the affected areas. For example, the occurrence of a significant event such as a protest or a clash between communities is likely to result in increased volumes of communication transactions in the affected location, probably due to individuals informing each other of the disruption and contacting friends and family.

Moreover, the data analysis system could be configured such that, upon selection of a particular communication transaction or a particular location on a particular date (e.g. by double-clicking on an area of the geospatial view), a default web search platform is opened and an automatically constructed search string is entered into the web search platform in order to search for information on events that may correlate with an identified pattern of communication transactions. The system can therefore provide a means for interpreting patterns in the communication transactions, so as to provide an insight into the potential cause of the pattern.

Alternatively, or in addition, the data analysis system could be provided with or connected to an events database that can be configured to store information regarding significant events (e.g. environmental, weather, seismic, social, political and economic events etc). For example, this information would typically include the date, time, and location of the event, together with descriptive information explaining the event. The data analysis system could then be configured such that, upon selection of a particular communication transaction or a particular location on a particular date (e.g. by double-clicking on an area of the geospatial view), a database query is automatically generated in order to search for events in the events database that could correlate with/be associated with the rendered communication transactions. For example, when a user selects the representation of a communication transaction or a bundle of communication transactions, the data analysis system could automatically query the database for events that occurred at or around the same time and/or location as the selected communication transactions. The data analysis system could then display the results of the query to the user, thereby assisting the user in identifying any events that could be the cause of, or otherwise provide an explanation for, any anomalous communication transactions.
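A query of this kind might be sketched as follows using an in-memory SQLite database. The table layout, column names, the time window of twelve hours and the search radius of half a degree are all hypothetical choices made for illustration; the description does not define a schema for the events database, and the demonstration rows are invented sample data.

```python
import sqlite3
from datetime import datetime, timedelta


def make_demo_db():
    """Build a small in-memory events database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (occurred_at TEXT, latitude REAL, "
                 "longitude REAL, description TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", [
        ("2011-05-06T12:00:00", 51.50, -0.12, "Street protest"),
        ("2011-03-01T09:00:00", 48.85, 2.35, "Rail strike"),
    ])
    return conn


def find_events(conn, when, lat, lon, hours=12, degrees=0.5):
    """Events within +/- hours of `when` and +/- degrees of (lat, lon)."""
    # ISO 8601 strings of equal format compare correctly as text.
    start = (when - timedelta(hours=hours)).isoformat()
    end = (when + timedelta(hours=hours)).isoformat()
    sql = ("SELECT occurred_at, description FROM events "
           "WHERE occurred_at BETWEEN ? AND ? "
           "AND latitude BETWEEN ? AND ? "
           "AND longitude BETWEEN ? AND ? "
           "ORDER BY occurred_at")
    params = (start, end, lat - degrees, lat + degrees,
              lon - degrees, lon + degrees)
    return conn.execute(sql, params).fetchall()


# Time and location taken from the user's selection on the geospatial view.
print(find_events(make_demo_db(), datetime(2011, 5, 6, 14, 0), 51.5, -0.1))
```

The parameters are passed separately from the SQL text, so values derived from the user's selection cannot alter the structure of the generated query.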

Although a system has been described above which facilitates the analysis of communications transactions, and an example has been illustrated which facilitates the analysis of mobile communications transactions within a network, it will be appreciated that the system could be adapted for the imaging, mapping and analysis of any type of transactional data where transactions can be associated with a source and a destination and the source and destination can be associated with nodes in a hierarchy. Thus, for example, the described system could be adapted to facilitate the review and interrogation of many types of financial or banking data.

Communication data could be combined with data from other sources to facilitate investigation of transactions. Thus, for example, where transactions can be associated with individuals, communication transaction data could be supplemented with data identifying other interactions between those individuals, such as financial transactions.

In the above embodiments, transactions are described as being represented as lines, where data for representing transactions is rendered to a graphics buffer 14 connected to a display. It will be appreciated that the representation of transactions in such a manner facilitates rapid update of a display 13. More specifically, as described, transactions are illustrated by a set of primitive elements rendered to a graphics buffer 14. When a subset of transactions is to be displayed, instructions to update a display can be limited to an instruction to cause only data for the selected transactions to be utilized to update the display 13. Data for the remaining transactions can, however, still remain within the graphics buffer 14 for use in rendering subsequent displays. Further, by causing transactions to be represented in the form of a set of quadrilaterals corresponding to a spliced cubic b-spline, the processing necessary to represent a transaction can be undertaken by a dedicated graphics processor, enhancing the speed of the system.
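As a rough sketch of how points sampled along a curve might be turned into such quadrilaterals, the following assumes that each pair of consecutive sample points forms the midpoints of the two opposing ends of one quadrilateral, with the remaining sides parallel to the line joining those midpoints. The function name and the width parameter are hypothetical.

```python
import math

def segment_quads(points, width):
    """Build one quadrilateral per pair of consecutive curve points.

    Each quad has the two points as the midpoints of its end edges, and its
    long sides run parallel to the segment between them.
    """
    quads = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0:
            continue  # skip repeated points, which have no direction
        # Normal to the segment, scaled to half the line width.
        nx = -dy / length * width / 2
        ny = dx / length * width / 2
        quads.append([
            (x0 + nx, y0 + ny), (x0 - nx, y0 - ny),
            (x1 - nx, y1 - ny), (x1 + nx, y1 + ny),
        ])
    return quads

# Three sample points standing in for points evaluated on a b-spline.
quads = segment_quads([(0, 0), (4, 0), (4, 3)], width=2)
```

Vertex lists of this kind map directly onto the primitives a graphics processor draws, which is why the tessellation can be prepared once and retained in a buffer.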

As described in the above embodiments, the coloring of lines representing relationships could be determined based upon a mapping from the number of times a portion of a screen was over-written, with the results being scaled so that the full range of colors which can be represented is apparent on the screen. It will be appreciated that, in addition or as an alternative to modifying the color of a line as it appears in an image, the thickness of a line could be scaled in a similar way. Thus, for example, where many similar transactions occur, a thicker line could be utilized to represent those transactions.
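A minimal sketch of such a scaling is given below, assuming that a count of over-writes is held for each screen portion and that the available colors are indexed from 0 to 255. Both the function name and the 256-level palette are assumptions made for illustration.

```python
def scale_to_color_range(counts, levels=256):
    """Map over-write counts onto color indices 0 .. levels - 1.

    The largest count is mapped to the top of the color range so that the
    full range of colors appears in the image.
    """
    highest = max(counts, default=0)
    if highest <= 0:
        return [0] * len(counts)  # nothing was drawn anywhere
    return [(c * (levels - 1)) // highest for c in counts]

# Over-write counts for four screen portions.
color_indices = scale_to_color_range([0, 2, 4, 10])
```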

As with the coloring of lines, the selection of a suitable line thickness could be made by determining the most and least common transactions which are to be represented, mapping those transactions to the thickest and thinnest lines respectively, and then representing transactions of intermediate frequency with lines of intermediate thickness.
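The simplest form of this mapping is a linear interpolation between the thinnest and thickest lines, sketched below. The function name and the default thickness values are hypothetical.

```python
def thickness_for(freq, min_freq, max_freq, thin=1.0, thick=8.0):
    """Linearly map a frequency in [min_freq, max_freq] to [thin, thick]."""
    if max_freq == min_freq:
        return thin  # every transaction is equally common
    fraction = (freq - min_freq) / (max_freq - min_freq)
    return thin + fraction * (thick - thin)

# Least common, intermediate and most common transactions.
widths = [thickness_for(f, 1, 101) for f in (1, 51, 101)]
```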

In such embodiments it may be preferable to map the frequencies based on a scaling such as a logarithmic or power function, so that lines corresponding to less frequent transactions are represented using thicker lines than their actual frequency would suggest.

In some embodiments it may be preferable to arrange all transactions to be drawn with lines of the same thickness. In other embodiments it may be preferable to exaggerate the relative thickness of lines representing less frequent transactions and reduce the relative thickness of lines representing more frequent transactions.

One approach to achieving such a scaling would be to utilize a power law such that a value x is mapped to a value x^p, where the value p is selected on the basis of the representation to be utilized for the greatest value. In such an embodiment, no scaling would occur if p was chosen to be equal to 1, values of p greater than 1 would cause more frequent transactions to be emphasized, whereas values of p below 1 would increase the representation of less frequent transactions.
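A power-law scaling of this kind can be sketched as follows. The sketch assumes that each frequency is first divided by the greatest frequency, so that the greatest value always maps to 1; that normalization and the function name are assumptions of the sketch. In this particular parameterization an exponent of 1 leaves the normalized value unchanged, and an exponent of 0 gives every transaction the same weight.

```python
def power_scale(freq, max_freq, p):
    """Map a frequency to a weight in (0, 1] using a power law.

    The frequency is normalized by the greatest frequency before the
    exponent is applied, so the greatest frequency always maps to 1.
    """
    return (freq / max_freq) ** p

# A transaction seen 25 times when the most common is seen 100 times.
lifted = power_scale(25, 100, 0.5)     # less frequent values raised
unchanged = power_scale(25, 100, 1)    # linear, no rescaling
suppressed = power_scale(25, 100, 2)   # less frequent values reduced
```

The resulting weight can then be fed into whatever thickness or color mapping is in use.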

Similarly, a scaling function may also be utilized to determine the size of the representation of the sections representing the origin or destination for a transaction. Thus, for example, rather than choosing the locations of control points to be evenly distributed along an edge or a circumference, the spacing of the control points could be based on the number of transactions to be represented. Again, such a value could be based either directly on that number or on a mapping which scales the number either to emphasize or de-emphasize the more frequent transactions.
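One way to picture this is to divide a circumference among the origins or destinations in proportion to their transaction counts, optionally after a power-law scaling. The sketch below is illustrative only; the function name, the use of angles in radians and the exponent parameter are assumptions.

```python
import math

def arc_spans(counts, p=1.0):
    """Divide a full circle among nodes in proportion to counts ** p.

    Returns a (start_angle, end_angle) pair in radians for each node.
    """
    weights = [c ** p for c in counts]
    total = sum(weights)
    spans, start = [], 0.0
    for w in weights:
        sweep = 2 * math.pi * w / total
        spans.append((start, start + sweep))
        start += sweep
    return spans

# Three nodes with 1, 1 and 2 transactions respectively.
spans = arc_spans([1, 1, 2])
# With p = 0 every node receives the same share, i.e. even spacing.
even = arc_spans([1, 1, 2], p=0)
```

A control point for each node could then be placed at the middle angle of its span.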

It will also be appreciated that the selection of colors, line thicknesses and the arrangement of control points at the circumference or edge of an image could be determined based on variables other than transaction frequency. Suitable variables could be determined directly from data associated with transactions or alternatively based on variables derived by processing such data.

Although in the examples described in detail hierarchies having three levels have been illustrated and discussed, it will be appreciated that in other embodiments hierarchies having more levels could be used.
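The calculation of control co-ordinates does not depend on the number of levels, as the following sketch suggests. It walks from the source up to the closest common parent and down to the destination, then blends each path co-ordinate with a point on the straight line between source and destination using a bundling factor. The node names, the co-ordinates, the even spacing of the straight-line points and the function names are assumptions of the sketch, and the optional removal of the common parent node is omitted for brevity.

```python
def path_via_common_parent(parent, src, dst):
    """List the nodes from src up to the closest common parent, then down to dst."""
    up = [src]
    while up[-1] in parent:
        up.append(parent[up[-1]])
    down = [dst]
    while down[-1] not in up:
        down.append(parent[down[-1]])
    common = down[-1]
    return up[:up.index(common) + 1] + down[-2::-1]

def bundled_control_points(parent, coords, src, dst, beta):
    """Blend hierarchy-path points with straight-line points.

    beta = 1 follows the hierarchy path; beta = 0 gives a straight line.
    """
    first = [coords[n] for n in path_via_common_parent(parent, src, dst)]
    (x0, y0), (x1, y1) = first[0], first[-1]
    last = max(len(first) - 1, 1)
    blended = []
    for i, (px, py) in enumerate(first):
        t = i / last  # assumed: straight-line points are evenly spaced
        sx, sy = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
        blended.append((beta * px + (1 - beta) * sx,
                        beta * py + (1 - beta) * sy))
    return blended

# Hypothetical three-level hierarchy: leaves a, b, c under r1 and r2.
parent = {"a": "r1", "b": "r1", "c": "r2", "r1": "root", "r2": "root"}
coords = {"a": (0, 0), "b": (2, 0), "c": (8, 0),
          "r1": (1, 2), "r2": (7, 2), "root": (4, 4)}
half_bundled = bundled_control_points(parent, coords, "a", "c", 0.5)
```

Adding further levels to the parent map lengthens the path but leaves both functions unchanged.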

Although the embodiments of the invention described with reference to the drawings comprise computer apparatus and processes performed in computer apparatus, the invention also extends to computer programs, particularly computer programs on or in a carrier, adapted for putting the invention into practice. The program may be in the form of source or object code or in any other form suitable for use in the implementation of the processes according to the invention. The carrier may be any entity or device capable of carrying the program.

For example, the carrier may comprise a storage medium, such as a ROM, for example a CD ROM or a semiconductor ROM, or a magnetic recording medium, for example a floppy disc or hard disk. Further, the carrier may be a transmissible carrier such as an electrical or optical signal which may be conveyed via electrical or optical cable or by radio or other means.

When a program is embodied in a signal which may be conveyed directly by a cable or other device or means, the carrier may be constituted by such cable or other device or means.

Alternatively, the carrier may be an integrated circuit in which the program is embedded, the integrated circuit being adapted for performing, or for use in the performance of, the relevant processes.

What is claimed is:
 1. A method of generating a display, displaying data representing a plurality of communication transactions, the method comprising: determining a hierarchy having a tree structure wherein leaf nodes in the lowest level of the hierarchy correspond to sources and destinations associated with communication transactions to be represented; associating elements of the hierarchy with co-ordinates on a display screen; and representing each of the plurality of communication transactions by: determining for each transaction a first set of control co-ordinates comprising the co-ordinates associated with elements in a path in the tree structure connecting the source and destination associated with a communication transaction via the closest common parent in the hierarchy common to the source and destination; determining for each transaction a second set of control co-ordinates for drawing a straight line between co-ordinates associated with the source and destination associated with the communication transaction; calculating as a set of control co-ordinates for representing a transaction weighted averages of corresponding co-ordinates in the first and second set, weighted by a bundling factor; and representing each of the communication transactions as a line drawn utilizing the calculated control co-ordinates for each transaction.
 2. The method of claim 1 wherein determining for each transaction a first set of control co-ordinates comprising the co-ordinates associated with elements in a path in the tree structure connecting the source and destination associated with a communication transaction via the closest common parent in the hierarchy common to the source and destination comprises: determining a list of nodes on the tree structure for connecting the source and destination associated with a communication transaction via the closest common parent in the hierarchy common to the source and destination and removing the node corresponding to the closest common parent if the source and destination for the transaction are not both child nodes of a single parent node.
 3. The method of claim 1 further comprising appending as control co-ordinates in the set of control co-ordinates for representing a transaction multiple sets of control co-ordinates associated with the source and destination of the transaction to be represented.
 4. The method of claim 3 wherein representing each of the communication transactions as a line drawn utilizing the calculated control co-ordinates for each transaction comprises representing each communication transaction as an appended series of b-splines as defined by groups of control co-ordinates in the calculated set.
 5. The method of claim 4 wherein representing each communication transaction as an appended series of b-splines comprises: determining co-ordinates for a number of points lying on the curve defined by the appended series of b-splines; and calculating co-ordinates for a set of quadrilaterals for representing the transaction on the basis of the co-ordinates of the number of points.
 6. The method of claim 5 wherein the calculation of the co-ordinates for a set of polygons is such to cause the points lying on the curve defined by the appended series of b-splines to lie on the midpoints of opposing ends of the quadrilaterals and the other sides of the quadrilaterals are parallel to a line connecting the midpoints of the opposing ends.
 7. The method of claim 5 further comprising representing said transactions by coloring said quadrilaterals.
 8. The method of claim 7 wherein the coloring of the quadrilaterals is determined based upon a criterion associated with the transaction represented by the quadrilateral.
 9. The method of claim 8 wherein the criterion associated with a transaction comprises a criterion associated with any of: the timing, frequency or amount associated with a transaction.
 10. The method of claim 7 wherein the coloring of said quadrilaterals varies along the length of the line drawn utilizing the calculated control co-ordinates.
 11. The method of claim 1 wherein representing each of the communication transactions as a line drawn utilizing the calculated control co-ordinates for each transaction comprises rendering each of the lines in a graphics buffer and then combining the rendered images.
 12. The method of claim 11 wherein combining the rendered images comprises: determining maximum color values for areas where lines overlap; determining color values for rendering lines in a constant color and calculating an alpha blend of the rendered lines; and utilizing the calculated maximum color values and the values of the determined alpha blend of constant color lines to determine the colors to be included in a final display.
 13. The method of claim 1 wherein the communication transactions comprise transactions selected from the group comprising: telephone calls, emails, text messages, instant messages, social media messages or posts, and communications between computers.
 14. The method of claim 1 wherein the hierarchy is geospatial such that the tree structure defines a hierarchical arrangement of geographical areas and leaf nodes correspond to geographical locations of sources and destinations.
 15. The method of claim 14 wherein the elements of the hierarchy are associated with co-ordinates on the display screen that correspond to relative geospatial coordinates of the elements.
 16. The method of claim 1 wherein the data representing a plurality of communication transactions includes a time at which each communication transaction occurred.
 17. The method of claim 16 wherein only those transactions that occurred within a defined period of time are represented on the display.
 18. The method of claim 1 and further comprising, upon a selection of a representation of a communication transaction, querying a database of event information using data associated with the communication transaction, and displaying the event information provided in response to the query.
 19. A data analysis system for displaying data facilitating visual analysis of communication transactions: the system comprising: a transactions database operable to store transaction records defining communication transactions; a display screen operable to display representations of communication transactions as lines connecting positions associated with a source and a destination for a communication transaction; and a processing module operable to determine a hierarchy having a tree structure wherein leaf nodes in the lowest level of the hierarchy correspond to sources and destinations associated with communication transactions represented by transaction records stored in the transactions database; associate elements of the hierarchy with co-ordinates on a display screen; and cause the display screen to show the representations of the communication transactions by: determining for each transaction a first set of control co-ordinates comprising the co-ordinates associated with elements in a path in the tree structure connecting the source and destination associated with a communication transaction via the closest common parent in the hierarchy common to the source and destination; determining for each transaction a second set of control co-ordinates for drawing a straight line between co-ordinates associated with the source and destination associated with the communication transaction; calculating as a set of control co-ordinates for representing a transaction weighted averages of corresponding co-ordinates in the first and second set, weighted by a bundling factor; and representing each of the communication transactions as a line drawn utilizing the calculated control co-ordinates for each transaction.
 20. A computer readable medium storing computer implementable instructions which when implemented by a programmable computer cause the computer to: determine a hierarchy having a tree structure wherein leaf nodes in the lowest level of the hierarchy correspond to sources and destinations associated with communication transactions to be represented; associate elements of the hierarchy with co-ordinates on a display screen; and represent each of the plurality of communication transactions by: determine for each transaction a first set of control co-ordinates comprising the co-ordinates associated with elements in a path in the tree structure connecting the source and destination associated with a communication transaction via the closest common parent in the hierarchy common to the source and destination; determine for each transaction a second set of control co-ordinates for drawing a straight line between co-ordinates associated with the source and destination associated with the communication transaction; calculate as a set of control co-ordinates for representing a transaction weighted averages of corresponding co-ordinates in the first and second set, weighted by a bundling factor; and represent each of the communication transactions as a line drawn utilizing the calculated control co-ordinates for each transaction.